ABSTRACT

We discuss the clustering properties of galaxies with signs of ongoing star formation detected by the Spitzer
Space Telescope in the 24 μm band in the SWIRE Lockman Hole field. The sample of mid-IR-selected galaxies
includes ~20,000 objects detected above a flux threshold of $S_{24\mu m}$ = 310 μJy. We adopt optical/near-IR color
selection criteria to split the sample into lower-redshift and higher-redshift galaxy populations. We measure
the angular correlation function on scales of $\theta$ = 0.01-3.5 deg, from which, using the Limber inversion
along with the redshift distribution established for similarly selected source populations in the GOODS fields
(Rodighiero et al. 2010), we obtain comoving correlation lengths of $r_0 = 4.98 \pm 0.28\,h^{-1}$ Mpc and
$r_0 = 8.04 \pm 0.69\,h^{-1}$ Mpc for the low-z ($z_{\rm mean}$ = 0.7) and high-z ($z_{\rm mean}$ = 1.7) subsamples, respectively. Comparing
these measurements with the correlation functions of dark matter halos identified in the Bolshoi cosmological
simulation (Klypin et al. 2011), we find that the high-redshift objects reside in progressively more massive
halos reaching $M_{\rm tot} > 3 \times 10^{12}\,h^{-1} M_\odot$, compared to $M_{\rm tot} > 7 \times 10^{11}\,h^{-1} M_\odot$ for the low-redshift population.
Approximate estimates of the IR luminosities based on the catalogs of 24 μm sources in the GOODS fields
show that our high-z subsample represents a population of "distant ULIRGs" with $L_{\rm IR} > 10^{12} L_\odot$, while the
low-z subsample mainly consists of "LIRGs", $L_{\rm IR} \sim 10^{11} L_\odot$. The comparison of the number density of the
24 μm selected galaxies with that of dark matter halos above the derived minimum mass $M_{\rm tot}$ shows that only 20% of such halos
may host star-forming galaxies.



1. INTRODUCTION 



The cosmic infrared background (CIB; Puget et al. 1996; Hauser et al. 1998) accounts for approximately
half of the total extragalactic background energy integrated over cosmic time and wavelengths (e.g.,
Dole et al. 2006; Hauser & Dwek 2001). The CIB emission is mainly contributed by star-forming galaxies
where optical-UV light from young stellar populations is absorbed by dust and re-emitted at longer
wavelengths. The IR energy output per unit volume must strongly increase with redshift to account for
the total measured CIB (Hauser & Dwek 2001; Lagache et al. 2005). Indeed, observations with the Infrared
Space Observatory (ISO; Genzel & Cesarsky 2000) and the Spitzer Space Telescope (Werner et al. 2004)
revealed a large number of distant mid- and far-infrared sources (Chary & Elbaz 2001; Elbaz et al. 2002;
Le Floc'h et al. 2005). According to the current consensus from both theoretical and observational
studies, major developments in the evolution of galaxies in the universe happened at high redshifts,
z > 1 (for references and details, see Franceschini et al. 2010), with the peak of star formation and
nuclear activity occurring at z ~ 2 (e.g., Madau et al. 1996; Hopkins 2004; Silverman et al. 2005;
Bouwens et al. 2011). A large fraction of the energy emitted during these active phases of galaxy
evolution is hidden by dust and can be detected only through mid- and far-IR observations. Therefore,
studying the distant universe in the infrared provides valuable information on the history of assembly
of present-day massive galaxies (e.g., Soifer et al. 2008; Le Floc'h et al. 2009; Santini et al. 2009;
Franceschini et al. 2010).

In this work, we use observations of star-forming galaxies made by the Spitzer Space Telescope at
24 μm. The Spitzer 24 μm surveys have revolutionized studies of "distant ULIRGs" (ultraluminous
infrared galaxies). These objects are dusty star-forming galaxies with infrared luminosity
$L_{\rm IR} > 10^{12} L_\odot$ (e.g., Rigby et al. 2004; Yan et al. 2005; Daddi et al. 2007; Fiolet et al. 2010;
Fadda et al. 2010). While the average spectral energy distribution of high-z sources is consistent
with that of present-day ULIRGs, the nature of these sources and the cosmological environment hosting
them must still be clarified (see Huang et al. 2009 for details and references). Various photometric
techniques are applied to identify high-redshift objects among the thousands detected by wide-field
Spitzer surveys, e.g., Yan et al. (2005), Magliocchetti et al. (2007), Farrah et al. (2008),
Lonsdale et al. (2009), Fiolet et al. (2009), Huang et al. (2009), and Dey et al. (2008). All these
selected objects represent sub-populations of ULIRGs with observational characteristics partly
overlapping those of star-forming galaxies detected in the optical and submillimeter (see recent papers
by Huang et al. 2009; Fiolet et al. 2009). The nature of these populations has been a subject of
intensive work based on modeling of their physical properties such as spectral energy distribution
(SED), star formation rate, stellar and halo masses, etc. (e.g., Granato et al. 2004; Dave et al. 2010;
Narayanan et al. 2010; Lacey et al. 2010). A significant new observational input for such studies can
be provided by measurements of the clustering amplitude, which is a unique tool for determining the
halo masses of high-redshift galaxies. The goal of this paper is to present a clustering and halo
occupation analysis of 24 μm detected galaxies from one of the largest Spitzer extragalactic surveys.

The first studies of the clustering of 24 μm galaxies were made either in small fields with low
statistics, e.g., Gilli et al. (2007) and Magliocchetti et al. (2008), or by applying additional
selection criteria, as in Farrah et al. (2006) and Brodwin et al. (2008). Here we improve on these
first measurements by using a large sample of ~20,000 galaxies detected in the Lockman Hole field,
~8 deg$^2$, and uniformly selected only by their 24 μm flux, $S_{24\mu m}$ > 310 μJy. Our data reduction
procedures are presented in Section 2. The clustering strength measurements of 24 μm selected galaxies
and the inferred properties of their dark matter (DM) halos are discussed in Sections 3 and 4.
Comparison with previously published results is presented in Section 5 and our conclusions appear in
Section 6.

Throughout the paper, all cosmology-dependent quantities are computed assuming a spatially flat model
with parameters $\Omega_M$ = 0.268 and $\Omega_\Lambda$ = 0.732 (best-fit ΛCDM parameters obtained from the
combination of CMB, supernovae, baryon acoustic oscillations, and galaxy cluster data, see
Vikhlinin et al. 2009). All distances are comoving and given with explicit h-scaling, where the Hubble
constant is $H_0 = 100\,h$ km s$^{-1}$ Mpc$^{-1}$. The parameter uncertainties are quoted at a confidence
level of 68%. IR luminosities were computed using $H_0$ = 70 km s$^{-1}$ Mpc$^{-1}$ (see Rodighiero et al.
2010 for details).

2. THE DATA SAMPLE 

For reliable clustering measurements one needs a statistically complete, large, and homogeneous sample
of sources selected over a large area of the sky to probe the correlation signal on a wide range of
scales. The Spitzer Wide-area InfraRed Extragalactic Survey (SWIRE, Lonsdale et al. 2003) is highly
suitable for this purpose, as was demonstrated in several papers (Waddington et al. 2007;
de la Torre et al. 2007; Farrah et al. 2006). It is the largest survey carried out with the Spitzer
Space Telescope, covering ~49 deg$^2$ in six separate fields in the Northern and Southern sky. Each
field was imaged in seven near-to-far infrared bands: the InfraRed Array Camera (IRAC) 3.6, 4.5, 5.8,
and 8.0 μm bands (Fazio et al. 2004) and the Multiband Imaging Photometer for Spitzer (MIPS) 24, 70,
and 160 μm bands (Rieke et al. 2004). In addition to the infrared observations, every SWIRE field has
high-quality ancillary data.

Following the goal of our work to estimate the correlation function of star-forming galaxies detected
in the MIPS 24 μm band, we first selected a sample of bright sources, $S_{24\mu m}$ > 400 μJy, from the
SWIRE ELAIS-S1 catalog (M. Vaccari et al., in preparation). However, our estimated angular correlation
function, $w(\theta)$, showed an unexpected lack of clustering signal at scales < 36". There were
suggestions in the literature (e.g., Gilli et al. 2007) that because of the poor angular resolution of
the MIPS instrument (~6" FWHM), there could be difficulties in determining $w(\theta)$ for faint sources
due to blending. However, the deficit of close pairs in the sample of bright sources remained
unexplained. This problem has no bearing on our main results presented below, but its origin obviously
needs to be understood. To this end, we carried out a comparison of the angular correlation function
of the 24 μm sources selected from the four largest SWIRE fields (Lockman Hole, ELAIS-N1, ELAIS-N2,
and CDFS) using two releases of the SWIRE team catalogs (versions 2005 and 2010), and an additional
source catalog based on the wavelet decomposition algorithm (Section 2.1). This comparison is reported
in the Appendix. Our clustering results for 24 μm sources presented below are based on the best
available catalog in the Lockman Hole field.

2.1. Wavelet-based Detection of 24 μm Sources

For the reasons outlined in the Appendix, we perform the clustering analysis of 24 μm sources
extracted from the publicly available MIPS images using the wavelet decomposition source detection
algorithm (wvdecomp, see Vikhlinin et al. 1998). At $S_{24\mu m}$ ~ 300 μJy this algorithm performs
nearly identically to the detection method used in the Final SWIRE Data Release (J. A. Surace et al.,
in preparation) in terms of the log N - log S distribution of detected sources and their angular
correlation function at large scales. The only noticeable difference is in the treatment of very
crowded regions and zones in the immediate vicinity of bright sources (see the Appendix). These
differences have no effect on our clustering results presented in Sections 3 and 4 below.

wvdecomp was designed to efficiently detect both point-like and slightly extended sources in crowded
fields. Originally, the wavelet decomposition program was intended for Poisson-noise-limited X-ray
images, where it generally outperforms its rivals (Revnivtsev et al. 2007), but it was found that with
a suitable choice of parameters, it also produces good results for the 24 μm MIPS images.

First, we re-bin the archival MIPS images to 2.4" pixels (by a factor of two with respect to the
original pixel size of 1.2") to reduce the cross-correlation of noise in adjacent pixels while still
maintaining adequate sampling of the PSF. We then convolve the image with the scale = 2 wavelet
filter, corresponding to an effective kernel width of ≈ 5"-6", matching the size of the MIPS 24 μm
point sources. The rms of variations in this convolved image, excluding the regions around bright
sources using σ-clipping, approximates the effective noise at the scale we are most interested in.
This noise level is supplied to the wvdecomp program (its internal noise determination algorithm is
best suited to the case of Poisson statistics and is thus not applicable to MIPS images). wvdecomp
starts with the smallest scales and iteratively detects and removes detected structures from the input
image, while adding them to the resulting "clean" image. When the process is finished at a given
scale, it proceeds to the next, at which the size of the wavelet kernel is increased by a factor of
two. In our case, the detection algorithm works on the scales corresponding to structure sizes (FWHM)
of ≈ 2.4", 5", and 10", bracketing the range of sizes for the MIPS point sources. The detection
threshold is set at 4.5σ, at which we expect ~100 false detections in the Lockman Hole area.⁷
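The noise estimate described above (the rms of the convolved image with bright-source regions rejected by σ-clipping) can be sketched as follows. This is only an illustration of iterative σ-clipping and not the wvdecomp code; the function name, the 3σ rejection threshold, and the iteration cap are assumptions made for this sketch.

```python
import math

def _mean_rms(data):
    """Return the mean and the rms scatter about the mean."""
    mean = sum(data) / len(data)
    rms = math.sqrt(sum((v - mean) ** 2 for v in data) / len(data))
    return mean, rms

def sigma_clipped_rms(values, nsigma=3.0, max_iter=10):
    """Reject pixels farther than nsigma*rms from the mean until none are rejected."""
    data = list(values)
    for _ in range(max_iter):
        mean, rms = _mean_rms(data)
        kept = [v for v in data if abs(v - mean) <= nsigma * rms]
        if len(kept) == len(data):
            break  # converged: nothing was clipped in this pass
        data = kept
    return _mean_rms(data)[1]

# Hypothetical pixel values: unit-amplitude background plus one bright source pixel.
pixels = [-1.0, 1.0] * 50 + [100.0]
noise = sigma_clipped_rms(pixels)
```

In this toy input the single bright pixel is rejected in the first pass, so the returned value reflects the background scatter only.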

The main output of the wavelet decomposition algorithm is a list of source locations detected above a
predefined SNR threshold, and a map which allows one to split the original image into "empty" regions
and those with significant emission "belonging" to a particular source. The source fluxes were then
measured using aperture photometry. In choosing the aperture size, the tradeoff is between our desire
to include as much of the source flux as possible in the aperture and the fact that for wide
apertures, the flux measurements are increasingly affected by larger-scale background fluctuations
and by source confusion. Several tests have shown that the best results are achieved for an aperture size of 4",



7 The calibration of the false detection rate was described in Vikhlinin et al. (1995), and was done
assuming uncorrelated Gaussian or Poisson noise in the image pixels. The noise properties in the SWIRE
images are more complex, but the above value is still a good order-of-magnitude estimate of the
false-positive rate in our 24 μm sample.

encompassing approximately 50% of the PSF power, and corresponding to the bright core of the MIPS PSF.
These aperture fluxes were then converted into total fluxes using the PSF model calibrated with images
of bright stars in the same field. Using this method, the 24 μm sources were extracted from the MIPS
map of the Lockman Hole field.

2.2. The Lockman Hole Source Sample 

The Lockman Hole is the largest of the SWIRE fields. In addition to deeper MIPS observations (the
limiting flux is $S_{24\mu m}$ = 310 μJy, compared, e.g., to $S_{24\mu m}$ = 400 μJy in the ELAIS-S1 field,
see Appendix B for details), it has deep and uniform data in many other bands. In particular, we used
the data from the Two Micron All Sky Survey (2MASS) for the star-mask construction (see Section 2.2.1),
and the optical observations carried out with INT-WFC and KPNO MOSAIC 1 (Gonzalez-Solares et al. 2011)
to photometrically separate the 24 μm-selected objects into the low- and high-redshift subsamples
(Section 2.2.2).

We cross-correlated our sample of 24 μm sources with the multi-band IRAC-based catalog (limiting
fluxes of $S_{3.6\mu m}$ = 7 μJy and $S_{4.5\mu m}$ = 11 μJy, M. Vaccari et al., in preparation) using a
matching radius of 3.2". We then applied the following flux cuts: 310 < $S_{24\mu m}$ < 2500 μJy,
$S_{3.6\mu m}$ < 1000 μJy, and $S_{4.5\mu m}$ < 1000 μJy. $S_{24\mu m}$ = 310 μJy is the flux at which the
catalog is complete and the fluxes are measured reliably and accurately. The bright flux cuts are
applied in order to conservatively discard obviously extended and/or saturated sources whose
astrometry may be poor and whose flux estimates may be affected by saturation. Only 1.7% of sources
with $S_{24\mu m}$ > 310 μJy had no IRAC counterparts. A small fraction of them are Galactic stars,
~0.3% are expected to be false detections for our choice of wvdecomp detection thresholds, and the
nature of the rest is unclear. In any case, their number is too small to affect our clustering
measurements.
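The flux cuts listed above amount to a simple per-source filter. A minimal sketch, assuming all fluxes are given in μJy; the function and argument names are illustrative and are not taken from any SWIRE software:

```python
def passes_flux_cuts(s24, s36, s45):
    """Return True if a source survives the 24, 3.6, and 4.5 micron flux cuts (microJy)."""
    in_24_range = 310.0 < s24 < 2500.0               # complete, and not saturated at 24 micron
    irac_ok = s36 < 1000.0 and s45 < 1000.0          # drop very bright or extended IRAC sources
    return in_24_range and irac_ok

# Hypothetical sources: (S24, S3.6, S4.5) in microJy.
catalog = [(500.0, 50.0, 40.0), (300.0, 50.0, 40.0), (500.0, 1500.0, 40.0)]
selected = [src for src in catalog if passes_flux_cuts(*src)]
```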

2.2.1. Elimination of Stars and the Region Mask 

Galactic stars contaminate our clustering analysis of extragalactic sources and should be removed.⁸
To this end, we followed the procedures of Shupe et al. (2008) and Waddington et al. (2007) in which
the foreground stars were identified using the 2MASS Point Source Catalog (Skrutskie et al.).
The derived 24 μm-IRAC catalog was cross-correlated with the 2MASS survey using a matching radius of
2.5". Shupe et al. (2008) proposed that nearly all of the 24 μm-emitting sources with color
$K_s - [24] < 2.0$ (Vega, mag) are Galactic stars (see their Figure 2). We applied this criterion to
our catalog and eliminated such sources.
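The color criterion can be written as a short test once the 24 μm flux is converted to a Vega magnitude. The sketch below is hedged: the zero-point flux of 7.14 Jy is an assumed value for the MIPS 24 μm band that should be checked against the instrument documentation, and the function names are hypothetical.

```python
import math

MIPS24_ZERO_POINT_JY = 7.14  # ASSUMED Vega zero point for MIPS 24 micron; verify before use

def mag24_vega(s24_ujy):
    """Convert a 24 micron flux in microJy to a Vega magnitude [24]."""
    return -2.5 * math.log10(s24_ujy * 1e-6 / MIPS24_ZERO_POINT_JY)

def is_probable_star(ks_vega, s24_ujy, color_cut=2.0):
    """Apply the K_s - [24] < 2.0 (Vega) criterion quoted in the text."""
    return ks_vega - mag24_vega(s24_ujy) < color_cut
```

For a source at the 310 μJy flux limit, this zero point gives [24] of roughly 10.9, so only sources brighter than about $K_s$ = 12.9 would be flagged under these assumptions.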

In addition to directly polluting the extragalactic sample, bright Galactic stars may affect our
clustering measurements indirectly, by obscuring the background galaxies or affecting the fluxes of
the fainter galaxies near the same line of sight. Therefore, we need to completely exclude from the
analysis the sky regions affected by the presence of bright foreground stars. Following Waddington
et al. (2007), this was achieved by masking out the circular regions around sources with $K_s < 12$
(Vega, mag) from the cross-correlated 24 μm-IRAC-2MASS catalog; the exclusion radius was determined
as $\log(R'') = 3.1 - 0.16\,K_s$, which is the distance at which the

8 We note, however, that the star removal is not a crucial component of our analysis since the
contamination of near- to mid-IR galaxy samples by foreground stars is a severe problem only at fluxes
brighter than several mJy.






























Fig. 1. Final region mask for the clustering analysis in the Lockman Hole field. The circles mark the
locations of stars and bright objects. The rectangles mask those regions where the completeness of
INT/WFC images is not achieved for i = 22.8 (AB mag). All black patches were excluded from the
subsequent analysis (see details in Section 2.2.1).



stellar PSF merges into the background (Waddington et al. 2007).
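The star mask follows directly from the relation $\log(R'') = 3.1 - 0.16\,K_s$ given above. The following sketch evaluates the exclusion radius and checks whether a position falls inside any masked circle; the flat-sky separation and the function names are assumptions of this illustration, not part of the published procedure.

```python
import math

def exclusion_radius_arcsec(ks_vega):
    """Exclusion radius in arcsec from log(R) = 3.1 - 0.16 K_s."""
    return 10.0 ** (3.1 - 0.16 * ks_vega)

def is_masked(ra_deg, dec_deg, stars):
    """Return True if (ra, dec) lies within the exclusion circle of any star.

    stars: iterable of (ra_deg, dec_deg, ks_vega) for sources with K_s < 12.
    Separations use a flat-sky approximation, adequate for sub-arcminute radii.
    """
    for star_ra, star_dec, ks in stars:
        dra = (ra_deg - star_ra) * math.cos(math.radians(star_dec))
        ddec = dec_deg - star_dec
        sep_arcsec = math.hypot(dra, ddec) * 3600.0
        if sep_arcsec < exclusion_radius_arcsec(ks):
            return True
    return False

# Hypothetical bright star near the Lockman Hole field center.
bright_stars = [(161.0, 58.0, 8.0)]
```

With this relation a $K_s$ = 12 star is masked out to about 15", and a $K_s$ = 8 star to about 66".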

A close examination of the 24 μm source catalog shows that there are spurious detections around very
bright 24 μm sources (most of which correspond to Galactic stars or low-z galaxies). Therefore, we
decided to mask out those regions as well. The exclusion radius was set to 20"-80", depending on the
source flux.

As we will discuss in the next section, Section 2.2.2, we use the INT/WFC optical data to divide our
sample photometrically into the low- and high-redshift subsamples. Unfortunately, the INT/WFC
observations are insufficiently deep in some subsections of the MIPS Lockman Hole image, and we had to
mask out those regions also. To identify the regions of insufficient INT/WFC depth, we examined the
distribution of optical counterparts for 3.6 μm IRAC sources at various i-band magnitude cuts. We
found that the depth is at least i = 22.8 throughout the field, except for the regions masked out as
rectangles in Figure 1. At fainter magnitudes, the WFC coverage becomes highly nonuniform.

The resulting mask, excluding the regions around bright stars, extremely bright 24 μm sources, and
the regions of nonuniform optical coverage, is shown in Figure 1 and was used in the estimation of the
angular correlation function (Section 3). The total "good" survey area is 7.9 deg$^2$.

2.2.2. Identifying Low- and High-redshift Galaxy Populations 

To derive the spatial correlation length and investigate the dependence of clustering on redshift, we
need to know the redshift distribution of the sources. Unfortunately, the vast majority of the 24 μm
sources selected in the Lockman Hole field have neither spectroscopic nor photometric redshifts. The
SWIRE photometric redshift catalog (Rowan-Robinson et al. 2008), available in this field, has a
limited and heavily inhomogeneous coverage for our sample. The approach we take instead is to use
simple photometric criteria to divide the catalog into the low- and high-redshift subsamples, and
then use a similarly selected sample of 24 μm sources from the GOODS survey to derive the redshift
distribution within each subsample.

To separate the sample into low- and high-redshift sources, we defined the optical-to-NIR color
selection criterion based on the optical i-band data (from the ESIS-VIMOS survey; Berta et al. 2008)
and SWIRE IRAC 4.5 μm observations in the ELAIS-S1 SWIRE field. In particular, we examined the
dependence of the $(i - [4.5])_{\rm AB}$ color on redshift for various galaxy spectral templates such as
Mrk 231 (Sy-1), IRAS 19254 (Sy-2), M 82 (starburst), M 51 (spiral), and NGC 4490 (blue spiral) (see
examples of a similar analysis in Berta et al. 2007, 2008). It appears that for starburst galaxies,
the color cut $(i - [4.5])_{\rm AB} \sim 3$ separates well the low (z < 1) and high (z > 1) redshift galaxy
populations, with only a small contamination in both groups. Such a rapid color transition around
z ~ 1 can be explained by the passage of the Balmer break in the galaxy spectra through or redward of
the i band.

To further refine this color selection criterion, we applied it to the deep Spitzer observations of
the GOODS fields (Rodighiero et al. 2010). The GOODS-N and GOODS-S 24 μm catalogs include 889 and 614
sources, respectively, detected in a total area of ~350 arcmin$^2$. The catalogs are complete down to
$S_{24\mu m}$ = 80 μJy. Observations in the i band were made by the Advanced Camera for Surveys in both
fields down to a magnitude limit i = 26.5 (Grazian et al. 2006). Redshift estimates are available for
all these sources; 46% are spectroscopic and 54% photometric redshifts. The latter are estimated with
an rms scatter in $z_{\rm phot} - z_{\rm spec}$ of 0.09 and 0.06 for the GOODS-N and GOODS-S samples,
respectively (for details see Rodighiero et al. 2010).



From the GOODS catalogs, we selected the sources with $S_{24\mu m}$ > 310 μJy and separated them into
two redshift bins, z > 1.2 and z < 1.2.⁹ The color-magnitude diagram for these sources shows that the
low- and high-z galaxies indeed can be separated by a boundary value of (i - 4.5) = 3 (AB mag) (dashed
line in Figure 2(a)). The deepest optical data available in the Lockman Hole field are those from the
INT/WFC, which provides sufficiently uniform coverage to i = 22.8 (with the 5σ magnitude limit
reaching i = 23.3 (AB) in the deepest sections of the survey). Therefore, a magnitude cut of i = 22.8
had to be incorporated in our selection. Figure 2(b) shows that the low-z sources fainter than
i = 22.8 (above dotted line) and with the color (i - 4.5) < 3 (AB mag) (below dashed line) in practice
are very few and they only minimally contaminate (~10%) the high-z sample. Based on these
considerations, we implemented the redshift separation as a combined color and magnitude criterion:
the source is considered to belong to the high-redshift sample if it is undetectable in the INT/WFC
i band, or its measured i magnitude is > 22.8, or the (i - 4.5) (AB mag) color is > 3.
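The combined color and magnitude criterion stated above can be expressed as a single predicate. This is a minimal sketch; the function name and the use of `None` to represent an i-band non-detection are conventions assumed here for illustration.

```python
def is_high_z(i_mag, m45_mag, i_limit=22.8, color_cut=3.0):
    """Classify a 24 micron source as high-redshift (AB magnitudes).

    i_mag is None when the source is undetected in the INT/WFC i band.
    """
    if i_mag is None:
        return True                       # undetected in the i band
    if i_mag > i_limit:
        return True                       # fainter than the uniform-depth limit
    return (i_mag - m45_mag) > color_cut  # red optical-to-4.5 micron color

# Hypothetical (i, [4.5]) pairs; sources failing all three tests go to the low-z subsample.
examples = [(None, 20.0), (23.5, 21.5), (22.0, 18.5), (21.0, 19.5)]
labels = ["high-z" if is_high_z(i, m) else "low-z" for i, m in examples]
```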

One of the main sources of concern for the color-magnitude based separation of 24 μm objects into
low- and high-redshift subsamples is the presence of active galactic nuclei (AGNs) in the sample.
Therefore, we checked the AGN content of the GOODS sample of 24 μm selected sources. According to
Rodighiero et al. (2010), less than 10% of these sources are type-1 AGNs. The authors classified the
observed SEDs using the Polletta et al. (2007) templates. This AGN fraction is consistent with that
reported by Gilli et al. (2007) and Treister et al. (2006), who used very deep Chandra X-ray
observations in the GOODS fields. Concerning the highly obscured (type-2) AGNs and the sources of
composite spectral type (starburst+AGN), their contribution to the 24 μm emitting sources is hard to
estimate. One of the reasons is that AGN and star formation activity often occur simultaneously, and
both are revealed in the form of 24 μm emission (see, e.g., Brand et al. 2009; Rodighiero et al.
2010; Franceschini et al. 2005 and references therein). Some studies suggest, on the basis of
estimates by different methods, that the 24 μm selected samples may contain ~20%-30% of AGNs of both
types (Sacchi et al. 2009; Franceschini et al. 2005). However, we note that to estimate the redshift
distribution within our color- and i-magnitude-selected subsamples, we used an empirical redshift
distribution of identically selected GOODS sources (see below). As long as the GOODS redshifts are
valid and the GOODS sample is a fair representation of our main Lockman Hole sample, the derived dN/dz
models for the low- and high-redshift subsamples are correct, even though the high-z subsample may be
slightly contaminated by AGNs.

2.3. Empirical Redshift Distributions 

We need a model for the redshift distribution of the sources in order to use the Limber equation
(Equations (3) and (9) below) to relate the angular and spatial correlation functions. We determined
these redshift distributions empirically, using the GOODS sources selected identically to our main
sample in the Lockman Hole field. All sources with $S_{24\mu m}$ > 310 μJy in the GOODS-N and GOODS-S
fields were divided into low- and high-redshift subsamples by applying the color-magnitude selection
criteria (Section 2.2.2 and Figure 2(b)). The obtained redshift distributions within these
photometrically selected samples are shown in Figure 3(a) and (b). These empirical distributions can
be well approximated by a Gaussian model:



$$\frac{dN}{dz} = C \exp\left[-\frac{(z - z_{\rm mean})^2}{2\sigma^2}\right] \qquad (1)$$



9 The boundary was chosen near the minimum of the bimodal redshift distribution predicted by the
Franceschini et al. (2010) model.



(blue and red lines in Figure 3). The best-fit parameters for the low-z subsample in the redshift
range 0 < z < 2 are C = 50, σ = 0.349, $z_{\rm mean}$ = 0.7. For the high-z subsample in the redshift range
0.5 < z < 3.5, we find C = 12, σ = 0.629, $z_{\rm mean}$ = 1.7. The derived widths are significantly larger
than the estimated uncertainties in the GOODS photometric redshifts (±0.06-0.09), and therefore
accurately approximate the intrinsic widths of the redshift distributions for our two subsamples.

This two-Gaussian model also provides a good fit to the redshift distribution of all GOODS sources
with $S_{24\mu m}$ > 310 μJy (i.e., without the photometric separation into low- and high-z subsamples).
The combined redshift distribution is shown in Figure 4, and the dashed line is the sum of the two
Gaussian models for the low- and high-z subsamples.
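The Gaussian model of Equation (1) and its two-component sum are straightforward to evaluate with the best-fit values quoted in this section. In the sketch below the function names are illustrative, and the fitted redshift ranges are not enforced, which is a simplification.

```python
import math

def gaussian_dndz(z, amplitude, z_mean, sigma):
    """Equation (1): dN/dz = C exp(-(z - z_mean)^2 / (2 sigma^2))."""
    return amplitude * math.exp(-(z - z_mean) ** 2 / (2.0 * sigma ** 2))

def dndz_low(z):
    """Low-z subsample: C = 50, sigma = 0.349, z_mean = 0.7."""
    return gaussian_dndz(z, 50.0, 0.7, 0.349)

def dndz_high(z):
    """High-z subsample: C = 12, sigma = 0.629, z_mean = 1.7."""
    return gaussian_dndz(z, 12.0, 1.7, 0.629)

def dndz_total(z):
    """Sum of the two Gaussian components (the combined model)."""
    return dndz_low(z) + dndz_high(z)
```

Only the shape of each distribution enters the Limber relation, so the amplitudes C matter for the combined curve but cancel when a single subsample is analyzed.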

We can also use these subsamples of GOODS galaxies to estimate the typical infrared luminosities
(8 μm-1000 μm) for our Lockman Hole sample. In the GOODS low-redshift subsample, $z_{\rm mean}$ = 0.7, the
mean luminosity is $L_{\rm IR} \sim 3 \times 10^{11} L_\odot$, indicating that the selected objects belong to the
class of luminous infrared galaxies ("LIRGs", $10^{11} L_\odot < L_{\rm IR} < 10^{12} L_\odot$, Sanders & Mirabel
1996). The high-redshift galaxies, $z_{\rm mean}$ = 1.7, have an order of magnitude higher mean luminosity,
$L_{\rm IR} \sim 3 \times 10^{12} L_\odot$, which places them in the category of ultraluminous infrared galaxies
("distant ULIRGs"; $L_{\rm IR} > 10^{12} L_\odot$; Sanders & Mirabel 1996). Barring an unexpectedly high level
of cosmic variance, our 24 μm sources selected in the Lockman Hole field should have the same mean
luminosities.

3. CLUSTERING PROPERTIES OF 24 μm SELECTED GALAXIES

The total area of the Lockman Hole field used in the clustering analysis (white regions in Figure 1)
is ≈ 7.9 deg$^2$. There are 21844 24 μm emitting objects with fluxes greater than

Fig. 2. Color-magnitude (a) and magnitude i vs. magnitude [4.5] (b) diagrams for the GOODS-N and GOODS-S sources with $S_{24\mu m}$ > 310 μJy. Open and
filled circles are galaxies at redshifts lower and higher than 1.2. In both panels, a dashed line represents (i - 4.5) = 3 (AB mag). A dotted line in panel (b)
corresponds to a magnitude i = 22.8 at which the INT/WFC coverage is uniform in the Lockman Hole field.




Fig. 3. Redshift distribution of GOODS sources ($S_{24\mu m}$ > 310 μJy) incorporated into low-z (a) and high-z (b) subsamples based on their color (i - [4.5]) and
i-band magnitude. Blue and red lines are Gaussian fits with $z_{\rm mean}$ = 0.7 and $z_{\rm mean}$ = 1.7, respectively.
between perturbed and full sample realizations:




Fig. 4. Redshift distribution of sources brighter than $S_{24\mu m}$ = 310 μJy
from the GOODS surveys. Blue and red lines are Gaussian fits to the redshift
distributions of sources that passed the color-magnitude selection. The dashed line is
the combined fit of the two selected samples.



310 μJy within this area. Applying the color-magnitude selection criteria (Section 2.2.2), we
obtained two subsamples of 14822 and 7022 sources with $z_{\rm mean}$ = 0.7 and $z_{\rm mean}$ = 1.7,
respectively.

The angular correlation functions were estimated by the Landy & Szalay (1993) method at angular
scales 0.01 < θ < 3.5 deg.¹⁰ The random points used in this estimator were homogeneously distributed
in the field but avoided the excluded regions of the mask shown in Figure 1. In order to suppress the
uncertainties related to the complex geometry of the field and to decrease the statistical errors,
the number of simulated random points was 100 times greater than the number of data points in each
sample. The correlation function was computed in angular bins of Δ log θ = 0.2. In Figure 5 we show
the derived angular correlation functions for the whole sample (open black triangles), for the low-z
subsample with $z_{\rm mean}$ = 0.7 (open blue circles), and for the high-z subsample with $z_{\rm mean}$ = 1.7
(filled red circles).
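For reference, the Landy & Szalay (1993) estimator combines normalized data-data, data-random, and random-random pair counts in each angular bin as w = (DD - 2DR + RR)/RR. The sketch below shows that combination and the logarithmic binning with Δ log θ = 0.2; the function names and the raw-count interface are assumptions of this illustration, and the pair counting itself is omitted.

```python
import math

def landy_szalay(dd, dr, rr, n_data, n_rand):
    """Landy-Szalay estimate of w(theta) in one bin from raw pair counts."""
    dd_norm = dd / (n_data * (n_data - 1) / 2.0)   # data-data pairs
    dr_norm = dr / (n_data * n_rand)               # data-random cross pairs
    rr_norm = rr / (n_rand * (n_rand - 1) / 2.0)   # random-random pairs
    return (dd_norm - 2.0 * dr_norm + rr_norm) / rr_norm

def log_bin_index(theta_deg, theta_min=0.01, dlog=0.2):
    """Index of the logarithmic bin (width dlog in log10 theta) containing theta."""
    return int(math.floor(math.log10(theta_deg / theta_min) / dlog))
```

With the random catalog 100 times larger than the data, as adopted in the text, the shot noise of the DR and RR terms becomes negligible compared with that of DD.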



The statistical uncertainties that can be assigned to the angular correlation function $w(\theta)$
measured using the Landy & Szalay estimator are $\delta w(\theta) = [1 + w(\theta)]/\sqrt{DD(\theta)}$
(Landy & Szalay 1993), where DD is the number of data pairs. However, these uncertainties do not
account for cosmic variance or for the covariance of the correlation function at different
separations, and therefore underestimate the real errors. These difficulties can be overcome by
applying, for instance, jackknife subsampling of the data (e.g., Scranton et al. 2002; Zehavi et al.
2002; Waddington et al. 2007; Ross et al. 2007). To calculate jackknife errors we divided the observed
field into 25 approximately equal-sized patches and computed the correlation function excluding one
patch of our sample at a time. The ensemble errors are then estimated from the scatter

10 These angular sizes correspond to the comoving separations 0.12-43, 0.31-109, 0.50-174, and
0.78-272 $h^{-1}$ Mpc at z = 0.25, 0.7, 1.3, and 2.8, respectively (cf. Figure 4).



$$\sigma^2(\theta) = \sum_i \frac{DR_i(\theta)}{DR(\theta)} \left[ w_i(\theta) - w(\theta) \right]^2 \qquad (2)$$



where DR is the number of pairs between cross-correlated data and random catalogs, i refers to a given
sample realization, and $DR_i/DR$ accounts for a complex field geometry (Myers et al. 2005; Ross et al.
2007). All quoted uncertainties are obtained by applying the jackknife subsampling technique to the
data, except in Appendix C where we compare the correlation functions from different catalogs and
calculate errors $\delta w(\theta)$ (see above).
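The jackknife scatter of Equation (2) can be sketched for a single angular bin as follows. The sketch assumes that the correlation function and the DR pair counts have already been computed for the full sample and for each of the leave-one-patch-out realizations; the function name and the numbers in the usage lines are hypothetical.

```python
import math

def jackknife_sigma(w_full, w_jack, dr_full, dr_jack):
    """Jackknife error on w(theta) in one bin, following Equation (2).

    w_full, dr_full: correlation function and DR pair count of the full sample.
    w_jack, dr_jack: the same quantities for each realization with one patch removed.
    """
    variance = sum(
        (dr_i / dr_full) * (w_i - w_full) ** 2
        for w_i, dr_i in zip(w_jack, dr_jack)
    )
    return math.sqrt(variance)

# Hypothetical numbers for two realizations, each retaining 96% of the DR pairs.
sigma = jackknife_sigma(1.0, [1.1, 0.9], 100.0, [96.0, 96.0])
```

The weight $DR_i/DR$ is close to 24/25 when the 25 patches are of nearly equal size, and it down-weights realizations that lose more area to the mask.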

Because of the good statistics of the SWIRE sample and the large size of the Lockman Hole field, we
are able to measure the clustering signal at angular scales which correspond to fairly large spatial
scales. Indeed, comoving sizes of 1-8 $h^{-1}$ Mpc at z = 1.7 correspond to an angular range of
0.017°-0.13°. A great advantage of measurements made at such large scales is that we directly probe
the clustering signal at angular separations which correspond to the expected range of
three-dimensional correlation lengths, $r_0$. This makes it possible to obtain robust estimates of $r_0$
from a standard power-law fit to the angular correlation function,
$w(\theta) = (\theta/\theta_0)^{1-\gamma}$, and application of the simplified Limber equation (the full
version is given by Equation (9)), which gives a direct link between the angular and spatial
correlation lengths:



= rlA{y)- 



JdzN(z) 2 H{z)D M {z) 

oo 

[ JdzN(z)] 



1-7 



(3) 



where D_M(z) is the transverse comoving distance to redshift
z and N(z) is the redshift distribution of sample galaxies.

H(z) = H_0 [Ω_M(1+z)^3 + Ω_k(1+z)^2 + Ω_Λ]^{1/2} is the Hubble
parameter at redshift z and A(γ) = Γ(1/2)Γ([γ-1]/2)/Γ(γ/2). If
the angular correlation function measurements at large scales
are unavailable, a power-law fit to the data at small
angular/spatial scales may lead to incorrect estimates of the
correlation lengths and incorrect conclusions about the clustering
properties of given galaxy populations (e.g., Kravtsov et al.
2004; Quadri et al. 2007, 2008, and references therein).
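
The inversion in Equation (3) can be sketched numerically as follows. This is a minimal illustration under stated assumptions, not the code used for the measurements: it assumes a flat universe (Ω_k = 0) with Ω_M = 0.27, inserts the factor c so that H(z)/c converts dz into a comoving interval, uses a simple trapezoidal rule, and adopts a hypothetical Gaussian N(z) (`toy_nz`) in place of the GOODS-based redshift distribution. All function names are illustrative.

```python
import math

C_KM_S = 299792.458   # speed of light in km/s
H0 = 100.0            # km/s per (h^-1 Mpc), so distances come out in h^-1 Mpc
OMEGA_M = 0.27        # matter density; flatness is an assumption of this sketch


def e_of_z(z):
    """E(z) = H(z)/H_0 for a flat universe (Omega_k = 0 assumed)."""
    return math.sqrt(OMEGA_M * (1.0 + z) ** 3 + (1.0 - OMEGA_M))


def trapz(func, a, b, n):
    """Composite trapezoidal rule with n equal sub-intervals."""
    step = (b - a) / n
    total = 0.5 * (func(a) + func(b))
    for i in range(1, n):
        total += func(a + i * step)
    return total * step


def comoving_distance(z, n=200):
    """Transverse comoving distance D_M(z) in h^-1 Mpc (flat universe)."""
    return (C_KM_S / H0) * trapz(lambda x: 1.0 / e_of_z(x), 0.0, z, n)


def a_gamma(gamma):
    """A(gamma) = Gamma(1/2) Gamma([gamma-1]/2) / Gamma(gamma/2)."""
    return (math.gamma(0.5) * math.gamma((gamma - 1.0) / 2.0)
            / math.gamma(gamma / 2.0))


def r0_from_theta0(theta0_arcsec, gamma, nz, z_min, z_max, n=400):
    """Invert Equation (3) for r_0 (h^-1 Mpc), given theta_0 in arcseconds."""
    theta0 = math.radians(theta0_arcsec / 3600.0)

    def integrand(z):
        # N(z)^2 * H(z)/c * D_M(z)^(1-gamma); z_min must be > 0 since D_M(0) = 0.
        return (nz(z) ** 2 * (H0 * e_of_z(z) / C_KM_S)
                * comoving_distance(z) ** (1.0 - gamma))

    kernel = trapz(integrand, z_min, z_max, n) / trapz(nz, z_min, z_max, n) ** 2
    return (theta0 ** (gamma - 1.0) / (a_gamma(gamma) * kernel)) ** (1.0 / gamma)


def toy_nz(z, mean=0.7, sigma=0.25):
    """Hypothetical Gaussian N(z); not the GOODS-based fit used in the text."""
    return math.exp(-0.5 * ((z - mean) / sigma) ** 2)


# Angle (degrees) subtended by 1 h^-1 Mpc comoving at z = 1.7.
angle_deg = math.degrees(1.0 / comoving_distance(1.7))
# Correlation length for a toy N(z); the result depends on the assumed N(z).
r0_toy = r0_from_theta0(0.63, 1.69, toy_nz, 0.01, 3.0)
```

Under these assumptions `angle_deg` comes out close to the 0.017° quoted above for 1 h^{-1} Mpc at z = 1.7. The value of `r0_toy` depends on the assumed width of N(z), so it should not be compared directly with the measured correlation lengths.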

The angular correlation functions shown in Figure 5 were
iteratively fitted over the angular range 0.01° < θ < 3.5° with
a power-law model, w(θ) = (θ/θ_0)^{1-γ} - IC, where the term IC
refers to the integral constraint. The IC correction accounts
for a systematic offset in the estimated correlation function due
to the finite size of any survey (Peebles 1974, 1980), and it is
usually calculated using a method proposed by Roche et al.
(1993):

$$
IC = \frac{\sum_j RR(\theta_j)\, \theta_j^{1-\gamma}}{\sum_j RR(\theta_j)}, \qquad (4)
$$



where RR(θ_j) is the number of random pairs in an angular
bin j.
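
A minimal sketch of the sum in Equation (4) is given below. The function name and the optional `theta0` argument are assumptions of this illustration: with `theta0 = 1` the function reduces to the sum as written, while passing the fitted θ_0 expresses IC in the same normalization as the model (θ/θ_0)^{1-γ}.

```python
def integral_constraint(theta, rr, gamma, theta0=1.0):
    """Integral constraint for a power law (theta/theta0)^(1-gamma), cf. Equation (4).

    theta : bin-center separations theta_j
    rr    : random-random pair counts RR(theta_j) in the same bins
    """
    if len(theta) != len(rr):
        raise ValueError("theta and rr must have the same length")
    weighted = sum(rr_j * (theta_j / theta0) ** (1.0 - gamma)
                   for theta_j, rr_j in zip(theta, rr))
    return weighted / sum(rr)


# Toy example: two bins with equal random pair counts and gamma = 2.
ic_toy = integral_constraint([1.0, 2.0], [1.0, 1.0], gamma=2.0)
```

Because IC depends on γ (and on θ_0 in this normalization), the fit is naturally iterative: IC is recomputed from the current parameters and the model is refitted until the parameters stop changing.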

The best-fit parameters for the entire sample are θ_0 =
0.31" ± 0.04", and y = 1.69 ± 0. 1 1 Splitting the whole 
sample into smaller subsamples obviously increases the
statistical uncertainties. Therefore, we decided to fix the
power-law slope in the subsequent analysis at γ = 1.69. The

11 The uncertainties include the covariance of the parameters.

Fig. 5. Two-point angular correlation function of SWIRE Lockman Hole
sources brighter than S_24μm = 310 μJy. The dotted lines are power-law fits.
Triangles represent clustering of the whole sample; open and filled circles are
for the low-z and high-z galaxies, respectively.

best-fit amplitudes for the low-z and high-z data are then
θ_0 = 0.63" ± 0.09" and θ_0 = 0.91" ± 0.21", respectively.
These best-fit models are shown in Figure 5 with blue and red
dotted lines.

The spatial correlation lengths r_0 were then obtained from
the Limber inversion (Equation (3)) using the fits to the
empirical redshift distributions of GOODS survey sources,
described in Section 2.3. The derived correlation lengths
are r_0 = 4.98 ± 0.28 h^{-1} Mpc (comoving) for the low-z
(z_mean = 0.7) and r_0 = 8.04 ± 0.69 h^{-1} Mpc for the high-z
(z_mean = 1.7) sample. Without using a fixed power-law slope,
we obtain r_0 = 5.07 ± 0.34 h^{-1} Mpc, γ = 1.63 ± 0.11, and
r_0 = 7.99 ± 0.75 h^{-1} Mpc, γ = 1.65 ± 0.20, for the low- and
high-z subsamples, respectively.

The uncertainties above include only statistical errors in the
measurement of the angular correlation function. In principle,
another source of uncertainty is inaccuracy in the models
for the redshift distribution. This is hard to estimate in
our case since we use an empirical fit to the dN/dz observed
for the GOODS sources, and any inaccuracies would be
related to problems with the GOODS photometric redshifts.¹²
The range of theoretical models for the redshift distribution
of 24 μm sources provides poor guidance because these
models, still poorly constrained by observations, sometimes
give contradictory results (Desai et al. 2008; Rowan-Robinson
et al. 2008; Franceschini et al. 2010). Qualitatively, if the real
dN/dz distribution for our sources is wider than what we
assume, the correlation lengths should be corrected upward.

As a further check, we re-estimated the correlation lengths
for our high-z subsample using the redshift distribution of the
24 μm sources in the COSMOS field (Sanders et al. 2007;
Le Floc'h et al. 2009; Ilbert et al. 2009). The COSMOS survey
area is significantly larger than GOODS (≈2 deg^2 versus
≈0.1 deg^2) and thus is more representative of our Lockman
Hole region. Unfortunately, there are two problems
which prevent us from using the COSMOS dN/dz as our
baseline model. First, the optical and near-IR data in the
COSMOS field are shallower than those in GOODS, which
can affect the dN/dz distribution at high redshifts. Indeed,
7% of the COSMOS 24 μm sources with S_24μm > 310 μJy
have no redshifts; this is ≈20% of the sources in our high-z
bin. Second, there is a significant overdensity of galaxies at
z ~ 1 in the COSMOS field (de la Torre et al. 2010). However,
even with these problems in mind, using the COSMOS-derived
dN/dz for the estimates of r_0 from the Limber equation
provides a useful test of the sensitivity of our results to the
assumed shape of the redshift distribution, possible cosmic
variance in the GOODS field, etc. We applied the same
color-magnitude criteria to the 24 μm COSMOS sources and
approximated the redshift distribution for the high-z bin using
either a single-Gaussian model, as we do for GOODS, or a
two-Gaussian model to better fit a component near z ~ 1. We
derive r_0 = 7.90 h^{-1} Mpc and 8.23 h^{-1} Mpc for these two
dN/dz approximations, respectively; these values are to be
compared with r_0 = 8.04 ± 0.69 h^{-1} Mpc, which we derive using
the GOODS dN/dz. Therefore, this test confirms that the
uncertainties in r_0 related to the redshift distribution of sources
are small compared to the purely statistical uncertainties.

12 We are unaware of such problems, and in any case, their discussion is
beyond the scope of our work.

In what follows, we use the derived correlation lengths for
the 24 μm selected galaxies for estimating the mass range of
their host DM halos through the comparison of our measurements
with the clustering properties of DM halos from the
Bolshoi cosmological simulation (Klypin et al. 2011).

4. PROPERTIES OF DARK MATTER HALOS HOSTING 24 μm SELECTED GALAXIES

4.1. Galaxy Population Model

Several methods can be used to connect a population of
galaxies with that of their host DM halos (see, e.g., Guo et al.
2010 and references therein). Here, we use the clustering
properties, assuming that the mass scale of the DM halos hosting
the galaxies can be established by requiring that the observed
correlation function of galaxies selected above a luminosity
threshold matches the correlation function of DM halos
selected above a certain mass limit (Kravtsov et al. 2004;
Conroy et al. 2006).



To compute the correlation function of the DM halos, we
used the outputs of the Bolshoi cosmological simulation for
redshifts ranging from 0.5 to 2.5 with a step size of Δz = 0.5.
The Bolshoi simulation, described in Klypin et al. (2011), is
a high-resolution and large-volume run performed with the
WMAP5 and WMAP7 cosmological parameters Ω_M = 0.27,
h = 0.7, and σ_8 = 0.82 (Komatsu et al. 2009, 2011). The
simulation contained 2048^3 ≈ 8 billion DM particles in a
250 h^{-1} Mpc box. The corresponding mass and force resolutions
are m_p = 1.35 × 10^8 h^{-1} M_⊙ (one particle mass) and
1.0 h^{-1} kpc (the smallest cell size in physical coordinates),
respectively. The simulation outputs were recorded at 180
time steps and were analyzed by the halo-finding algorithm
(Klypin & Holtzman 1997; Kravtsov et al. 2004; Klypin et al.
2011) to locate gravitationally bound objects and to calculate
their characteristics such as the virial mass M_vir, virial radius
R_vir, maximum circular velocity v_max, etc. The identified halos
are classified into distinct (host, parent) halos whose centers
are not located within any larger virialized systems, and
subhalos (satellites, substructure) which lie within the virial
radius of a larger halo. The completeness limit for the halo
catalogs derived from the Bolshoi outputs is v_max = 50 km s^{-1}
or M_vir ≈ 1.5 × 10^10 h^{-1} M_⊙.

Fig. 6. Spatial correlation length of dark matter halos as a function of the
maximum circular velocity threshold and redshift.

As outlined in Kravtsov & Klypin (1999), Nagai &
Kravtsov (2005), and Conroy et al. (2006), the maximum
circular velocity, v_max, of a DM halo, rather than its virial mass,
is more closely related to the properties of a galaxy residing in
this halo. Therefore, we "populated" the Bolshoi simulation
with "galaxies" by putting the "galaxies" at the centers of all
halos and subhalos selected above a given v_max threshold (this
threshold value of v_max is referred to as V_min hereafter). The
considered range of V_min is 130 < V_min < 385 km s^{-1}. The
lower velocity limit is chosen so that the correlation length
for such DM halos is below the r_0 derived for our low-z
subsample of 24 μm galaxies. The high velocity limit is chosen
to ensure that the statistics of DM halos are sufficiently good at
all output redshifts of the Bolshoi simulation. We estimated
the correlation lengths for the model galaxy populations by
fitting their spatial correlation functions with a power law at
scales 1 < r < 25 h^{-1} Mpc.
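
One simple way to extract a correlation length from a tabulated ξ(r) is a straight-line fit in log-log space. The sketch below assumes the form ξ(r) = (r/r_0)^{-γ} and an unweighted least-squares fit; the text does not specify the fitting procedure actually used, so both the method and the function name `fit_power_law` are assumptions of this illustration.

```python
import math


def fit_power_law(r, xi):
    """Unweighted least-squares fit of xi(r) = (r/r0)^(-gamma) in log-log space.

    Returns (r0, gamma). All xi values must be positive.
    """
    x = [math.log(v) for v in r]
    y = [math.log(v) for v in xi]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx                       # equals -gamma
    intercept = mean_y - slope * mean_x     # equals gamma * ln(r0)
    gamma = -slope
    r0 = math.exp(intercept / gamma)
    return r0, gamma


# Synthetic check: an exact power law over 1 < r < 25 is recovered.
radii = [1.0, 2.0, 5.0, 10.0, 25.0]
xi_values = [(v / 5.0) ** -1.8 for v in radii]
r0_fit, gamma_fit = fit_power_law(radii, xi_values)
```

For real halo catalogs the points would carry uncertainties and the fit would normally be weighted accordingly.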

Figure 6 shows the derived model correlation lengths for
DM halos as a function of V_min and redshift. Clearly, r_0
increases significantly with V_min (or mass) of the halos and
also changes with redshift. These correlation lengths can be
matched to the observed r_0 for our samples of 24 μm selected
galaxies. The redshifts of the simulation outputs do not match
exactly the mean redshifts of our galaxy samples, z_mean = 0.7
and z_mean = 1.7. However, the trend of the model r_0 with z for
a given V_min is weak,¹³ and so we can linearly interpolate
between the results for the outputs bracketing the mean redshifts
in the data.

4.2. Halo Mass and Number Density 

Using these data, each observed value of r_0 can be matched
to the corresponding V_min. The uncertainty intervals for our
low- and high-z subsamples, r_0 = 4.98 ± 0.28 h^{-1} Mpc and

13 Note that r_0 as a function of mass does evolve with redshift, as expected.
However, this evolution appears to be canceled by the evolution in the
M-v_max relation and the trend of r_0 with M at a given redshift.



r_0 = 8.04 ± 0.69 h^{-1} Mpc, respectively, correspond to V_min
intervals of V_min = 172 ± 18 km s^{-1} for low-z 24 μm galaxies
and V_min = 322 ± 33 km s^{-1} for the high-z subsample with
z_mean = 1.7.



These velocity thresholds can be easily converted to the
corresponding virial mass limits, M_vir, using a tight scaling,
which approximately goes as v_max ∝ M_vir^{1/3} (e.g., Klypin et al.
2011). This relation is valid for both distinct halos and
subhalos at different redshifts. Fitting the v_max-M_vir relation for
all halos and subhalos above v_max > 130 km s^{-1} in the Bolshoi
outputs, we obtain the following power-law scalings:



$$
\log M_{\rm vir} = 4.60 + 3.25 \log v_{\max}, \quad {\rm for}\ z = 0.5, \qquad (5)
$$
$$
\log M_{\rm vir} = 4.69 + 3.13 \log v_{\max}, \quad {\rm for}\ z = 1.5, \qquad (6)
$$


where M_vir is in units of h^{-1} M_⊙. These results can be scaled
to the mean redshifts of our samples using the expected
redshift evolution of the v_max-M_vir relation, which goes as
M_vir ∝ E(z)^{-1} for a fixed v_max (Borgani & Kravtsov 2011),
where E(z) ≡ H(z)/H_0. Using these scalings, we find
that the limiting total mass for the 24 μm emitting galaxies
with z_mean = 0.7 is M_tot = (0.7 ± 0.2) × 10^12 h^{-1} M_⊙, and
M_tot = (3.1 ± 1.0) × 10^12 h^{-1} M_⊙ for our high-z sample.
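
The conversion from a velocity threshold to a mass threshold can be reproduced with a few lines of arithmetic. The sketch below applies Equations (5) and (6) and then rescales by E(z)^{-1}; it assumes a flat universe with Ω_M = 0.27 and takes v_max in km s^{-1}. The function and variable names are illustrative.

```python
import math

OMEGA_M = 0.27  # flat universe assumed (Omega_Lambda = 1 - Omega_M)

# (intercept, slope) of log M_vir = a + b * log v_max, Equations (5) and (6),
# keyed by the redshift of the simulation output used for the fit.
FITS = {0.5: (4.60, 3.25), 1.5: (4.69, 3.13)}


def e_of_z(z):
    """E(z) = H(z)/H_0 for the assumed flat cosmology."""
    return math.sqrt(OMEGA_M * (1.0 + z) ** 3 + (1.0 - OMEGA_M))


def mass_threshold(v_min, z_fit, z_target):
    """M_vir (h^-1 M_sun) for a v_max threshold in km/s, scaled to z_target."""
    a, b = FITS[z_fit]
    m_fit = 10.0 ** (a + b * math.log10(v_min))
    # M_vir scales as 1/E(z) at fixed v_max.
    return m_fit * e_of_z(z_fit) / e_of_z(z_target)


m_low = mass_threshold(172.0, z_fit=0.5, z_target=0.7)
m_high = mass_threshold(322.0, z_fit=1.5, z_target=1.7)
```

Under these assumptions the two thresholds evaluate to roughly 0.65 × 10^12 and 3.1 × 10^12 h^{-1} M_⊙, in line with the values quoted above.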

Having established this mass scale, we can approximately
estimate the fraction of massive DM halos containing 24 μm
emitting galaxies, even though our sample is not volume-limited.
The observed comoving number density of the galaxies
near the mean redshift of the sample can be estimated as

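$$
n_{\rm gal} = \frac{dN/dz}{dV/dz} = 1.1 \times 10^{-3}\, h^3\, {\rm Mpc}^{-3}, \quad z_{\rm mean} = 0.7, \qquad (7)
$$
$$
n_{\rm gal} = 0.12 \times 10^{-3}\, h^3\, {\rm Mpc}^{-3}, \quad z_{\rm mean} = 1.7, \qquad (8)
$$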

where dV/dz is the comoving volume within the survey area.
These values are compared with the number density of halos
in the Bolshoi outputs above the derived V_min thresholds.
For z = 0.5, v_max > 172 km s^{-1}, we find n_halo =
5.0 × 10^{-3} h^3 Mpc^{-3}, or n_halo ≈ 5 n_gal. For z = 1.5, v_max >
322 km s^{-1}, the corresponding number densities are n_halo =
0.48 × 10^{-3} h^3 Mpc^{-3}, or n_halo ≈ 4 n_gal.¹⁵ Therefore, we find
that similar fractions, ~20%, of DM halos contain galaxies
with S_24μm > 310 μJy at both low and high redshifts. This
may be simply a coincidence since the mass and 24 μm
luminosity scales for the two samples are quite different and so we
cannot separate the luminosity and redshift dependences.
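
The occupation fractions follow directly from the rounded number densities quoted above. The short sketch below only restates that arithmetic; the function name is illustrative.

```python
def occupied_fraction(n_gal, n_halo):
    """Fraction of halos above the V_min threshold that host a selected galaxy."""
    return n_gal / n_halo


# Number densities in h^3 Mpc^-3, as quoted in the text.
frac_low = occupied_fraction(1.1e-3, 5.0e-3)      # low-z sample
frac_high = occupied_fraction(0.12e-3, 0.48e-3)   # high-z sample
```

The two ratios are 0.22 and 0.25, which is the basis for the ~20% figure.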

4.3. Full Limber Modeling of the Observed Angular 
Correlation Function 

Finally, we test that our analysis based on the power-law
approximation of the observed angular correlation functions
provides unbiased answers even though the correlation function
of DM halos shows clear deviations from the power law
at both small and large scales (Kravtsov et al. 2004; Springel
et al. 2005). For this, we compute a full projection of the
two-point spatial correlation function of the Bolshoi DM halos
for v_max > 172 km s^{-1} at z = 0.5 and v_max > 322 km s^{-1}
at z = 1.5.¹⁶ The spatial correlation functions, ξ(r), for the



14 For reference, the Milky Way dark matter halo is estimated to have
V_max = 201 km s^{-1} and M_tot ~ 1.4 × 10^12 h^{-1} M_⊙ (e.g., Guo et al. 2010).

15 The halo number densities at the mean redshifts of our samples were de- 
termined by the interpolation using the closest output redshifts of the Bolshoi 
simulation. 

16 Note that in calculating the projected models, we neglected the redshift 
evolution of the DM halo correlation function within the redshift intervals 







Fig. 7. Observed two-point angular correlation function for low-z (open
circles) and high-z (filled circles) samples of the 24 μm selected galaxies.
The dashed and solid lines are the angular correlation function models
derived from the full Limber inversion of spatial correlation functions of DM
halos with maximum circular velocities greater than V_min = 172 km s^{-1} and
V_min = 322 km s^{-1}, respectively.

halos were calculated at scales < r < 50 h^{-1} Mpc in narrow,
Δ log r = 0.1 h^{-1} Mpc, bins, and then were used in the full
Limber (1953) transformation:



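$$
w(\theta) = \frac{2 \int_0^\infty dz\, N(z)^2 H(z) \int_0^{\pi_{\max}} d\pi\, \xi\!\left(\sqrt{[D_M(z)\theta]^2 + \pi^2}\right)}{\left[ \int_0^\infty dz\, N(z) \right]^2}, \qquad (9)
$$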



where the functions are the same as in Equation (3),
ξ(r) = ξ(√([D_M(z)θ]^2 + π^2)) is the three-dimensional
correlation function under the small-angle approximation (θ ≪ 1
rad), and π is the radial separation. The results are shown in
Figure 7. The blue and red data points (open and filled circles,
respectively) show the observed angular correlation functions
for the low-z and high-z samples (same as those in Figure 5),
and the lines are the full projections of the halo correlation
functions for the best-fit values of V_min.
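
The link between the full projection in Equation (9) and the simplified form in Equation (3) can be checked numerically: for a pure power law, ξ(r) = (r/r_0)^{-γ}, the inner integral 2∫dπ ξ(√(R^2 + π^2)) equals r_0^γ R^{1-γ} A(γ) when π_max tends to infinity. The sketch below is an illustration of that identity only; the sinh substitution, the finite π_max, and all names are choices of this sketch rather than details taken from the text.

```python
import math


def a_gamma(gamma):
    """A(gamma) = Gamma(1/2) Gamma([gamma-1]/2) / Gamma(gamma/2)."""
    return (math.gamma(0.5) * math.gamma((gamma - 1.0) / 2.0)
            / math.gamma(gamma / 2.0))


def projected_xi(xi, r_perp, pi_max, n=4000):
    """2 * integral_0^pi_max d(pi) xi(sqrt(r_perp^2 + pi^2)), trapezoidal rule.

    Uses pi = r_perp * sinh(u), so that r = r_perp * cosh(u) and d(pi) = r du,
    which samples small and large radial separations evenly in u.
    """
    u_max = math.asinh(pi_max / r_perp)
    du = u_max / n
    total = 0.0
    for i in range(n + 1):
        weight = 0.5 if i in (0, n) else 1.0
        r = r_perp * math.cosh(i * du)
        total += weight * xi(r) * r
    return 2.0 * total * du


R0, GAMMA = 5.0, 1.8   # illustrative power-law parameters


def power_law_xi(r):
    return (r / R0) ** (-GAMMA)


numeric = projected_xi(power_law_xi, r_perp=1.0, pi_max=1.0e4)
analytic = R0 ** GAMMA * 1.0 ** (1.0 - GAMMA) * a_gamma(GAMMA)
ratio = numeric / analytic
```

The ratio is slightly below unity because the integral is truncated at a finite π_max. For a tabulated halo ξ(r), the same routine could be applied with an interpolating function in place of `power_law_xi`.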

Clearly, the full models fit the data points very well,
confirming that the power-law approximation to the observed
w(θ) yields accurate estimates of the spatial correlation
lengths, r_0, and thus accurate mass scales for the DM
halos hosting the 24 μm selected galaxies. At θ > 0.2 deg
we observe a decline of the measured correlation functions
relative to the power-law approximations, and this could be
related to the behavior of the DM halo correlation function
at large scales (e.g., Springel et al. 2005 and model curves in
Figure 7).

At the opposite end, θ < 0.01 deg, the models show
enhancements in the clustering signal relative to the power-law

covered by the data. As is clear from Figure 6, the change in the clustering
length at our V_min thresholds is comparable to the statistical uncertainties for
the r_0 measurements, so this assumption is justified.



extrapolation from large radii. These enhancements correspond
to the correlation function of galaxies located within
a single parent halo (the so-called "one-halo" term; Cooray
& Sheth 2002; Kravtsov et al. 2004). The measurements of
the correlation function at these scales are very interesting
because they can be used to determine the location of galaxies in
the host DM halos, and thus to constrain their recent merger
history (e.g., Porciani & Giavalisco 2002; Lee et al. 2006;
Quadri et al. 2008; Cooray et al. 2010). Unfortunately, the broad
PSF of the MIPS instrument does not allow us to make
reliable measurements of the clustering of 24 μm sources at
such small scales (see discussion in Appendix C).

5. COMPARISON WITH PREVIOUS MEASUREMENTS 

It is important to compare our measurements with the
previous studies of the clustering properties of 24 μm selected
galaxies. In doing so, we should keep in mind that direct
comparisons with other studies are difficult because of a wide
variety of criteria used for selecting high-redshift sources. The
comparison presented below is done in terms of the correlation
lengths. We do not compare the derived halo masses because
their estimates depend on the assumptions on the cosmological
parameters, power spectrum, and halo occupation
models (e.g., Conroy et al. 2008), and even the definition used
(e.g., threshold versus mean mass for a population).

We start with low-redshift (z < 1) samples selected in
small areas. Gilli et al. (2007) presented the correlation
function measurements of the S_24μm > 20 μJy galaxies with
the mean z ~ 0.8, detected in the GOODS fields. They
found that the correlation length increases with the infrared
luminosity, reaching for LIRGs (L_IR > 10^11 L_⊙) a level of
r_0 = 5.14 ± 0.76 h^{-1} Mpc. Our estimate of r_0 for the low-z
subsample (z_mean = 0.7) is almost identical to this value.
Another study, focused on the bright 24 μm emitting galaxies,
was performed by Magliocchetti et al. (2008). The galaxies
brighter than S_24μm = 400 μJy detected in the SWIRE XMM-LSS
field (0.7 deg^2 used in the analysis) were divided into
low-redshift (350 sources at z_mean = 0.79) and high-redshift
(210 objects at z_mean = 2.02) subsamples based on photometric
redshifts. The samples are thus comparable to those
selected in our work. The derived correlation lengths were
5.9 h^{-1} Mpc and 11.1 h^{-1} Mpc (each with asymmetric
uncertainties) for the low- and high-z subsamples, respectively.
Within uncertainties, these results are in reasonable agreement
with our measurements. However, our sample contains a much
larger number of sources and covers a wider area, so we were
able to measure the angular correlation function at larger scales
(probing directly the "two-halo" term; e.g., Cooray & Sheth 2002)
and significantly reduce the statistical uncertainties.

Several studies were focused on distant ULIRGs (z ~ 2), but
they used selection criteria in addition to 24 μm flux (Farrah
et al. 2006; Magliocchetti et al. 2007; Brodwin et al. 2008);
therefore, their results and ours should be compared with
caution. For example, Farrah et al. (2006) used a sample of
ULIRGs with S_24μm > 400 μJy which also had a spectral
peak in the 4.5 μm and 5.8 μm IRAC bands, corresponding to
the redshifted stellar 1.6 μm peak. The 4.5 μm peak sources
were estimated to be at 1.5 < z < 2.0; their derived correlation
length was r_0 = 9.40 ± 2.24 h^{-1} Mpc. The 5.8 μm peak
sources are at 2 < z < 3 and their angular clustering
corresponded to the correlation length of r_0 = 14.40 ± 1.99 h^{-1} Mpc.
The Farrah et al. r_0 for the 24 μm + 4.5 μm peak sample is
higher than our value for the high-z sample, but consistent
within the errors. We note that their results (as well as those of
Magliocchetti et al. 2008) are dominated by the angular
clustering measurements at small scales, and thus can be biased if
one uses a power-law fit for the angular correlation function
(Kravtsov et al. 2004; Quadri et al. 2007). In another work, a
sample of dust-obscured galaxies ("DOGs"; Dey et al. 2008)
was selected. DOGs are mid-IR luminous (S_24μm > 300 μJy)
and optically faint (R - [24] > 14) galaxies estimated to be at
z ~ 2. Their measured correlation length is 7.4 h^{-1} Mpc
(with asymmetric uncertainties; Brodwin et al. 2008), similar
to our value.

Models of galaxy formation suggest that DOGs and
submillimeter galaxies ("SMGs"; Blain et al. 2002) form by
mergers of massive (M_tot ~ 10^{12-13} h^{-1} M_⊙) galaxies (see
Narayanan et al. 2010 and references therein) and may
represent different phases in the evolution of a merging system.
It would be interesting to compare the clustering of
SMGs and other classes of ULIRGs, but, unfortunately, the
present estimates of the SMG correlation length are too
uncertain (Blain et al. 2004; Scott et al. 2006; Weiß et al. 2009;
Viero et al. 2009; Maddox et al. 2010; Cooray et al. 2010;
Amblard et al. 2011). The best available measurements for
submillimeter sources with redshifts close to our high-z
subsample have been presented in Cooray et al. (2010). The
authors reported a clustering strength of r_0 = 3.15 ± 0.35 h^{-1} Mpc
and r_0 = 4.41 ± 0.49 h^{-1} Mpc for the HerMES-Herschel sources
detected down to 30 mJy at 250 μm and 500 μm. The mean
redshifts of the samples are z_mean ≈ 2.1 and z_mean ≈ 2.6. It is
unlikely that these sources are directly related to our 24 μm
selected galaxies because of very different values of the inferred
correlation lengths.

6. CONCLUSIONS 

We presented an analysis of the clustering properties of
24 μm emitting (S_24μm > 310 μJy) galaxies detected in Lockman
Hole, one of the largest fields in the Spitzer/SWIRE survey.
The large number of sources (~20,000) and the size
of the field allowed us to detect the clustering signal with a
high level of significance and probe large angular scales. Due
to the lack of direct redshift measurements for the objects
in the Lockman Hole sample, we used the optical and near-IR
photometric data to separate the sample into high-redshift
and low-redshift galaxies. The selection criteria as well as
the redshift distributions for color-separated subsamples were
empirically established using the catalogs of GOODS 24 μm
sources (Rodighiero et al. 2010), whose redshifts were measured
spectroscopically or estimated from multiband photometry.
Using a power-law approximation to the correlation
function, we derived the spatial correlation length r_0. We
found r_0 = 4.98 ± 0.28 h^{-1} Mpc and r_0 = 8.04 ± 0.69 h^{-1} Mpc
for z_mean = 0.7 and z_mean = 1.7 populations, respectively.

The estimated infrared luminosities showed that our 24 μm
selected galaxies belong to populations of distant ULIRGs
and local LIRGs. Based on the clustering analysis, we can
conclude that our 24 μm selected galaxies represent different
populations of objects found in differently sized DM halos,
M_tot > 7 × 10^11 h^{-1} M_⊙ and M_tot > 3 × 10^12 h^{-1} M_⊙ at low and
high redshifts, respectively. In each case, the 24 μm selected
galaxies populate ~20% of the halos at these mass thresholds.
Their high mid-IR luminosities may be caused
by similar physical processes (e.g., triggered by mergers or
interactions), but occurring in different environments. Further
information can be obtained by studying in detail the
dependence of clustering properties on the IR luminosity at each
redshift.

We are grateful to A. Klypin for letting us use the outputs of
the Bolshoi cosmological simulations. We thank C. Jones for
careful reading of the manuscript and useful comments. S.S.
was supported by the Smithsonian Grand Challenges Consor-

APPENDIX



Below, we present a study of the stability of the correlation function measurements for 24 μm sources through comparison of
different source catalogs in the SWIRE fields. In particular, we use the four largest SWIRE fields (Lockman Hole, ELAIS-N1,
ELAIS-N2, CDFS) and three catalogs: two versions of the SWIRE team catalogs (produced in 2005 and 2010, respectively) and
our own list of sources extracted from Spitzer-MIPS maps using the wavelet decomposition method (Vikhlinin et al. 1998).



CATALOGS OF 24 μm SOURCES

The first data set we used is the publicly available catalogs from the SWIRE Data Release 2 (version 2005).¹⁷ These catalogs
consist of the optical, IRAC, and MIPS 24 μm information merged into a single table for sources detected in the IRAC 3.6 and
4.5 μm bands above pre-defined SNR thresholds. Source detection in the MIPS data was carried out using SExtractor (Bertin &
Arnouts 1996). The estimated completeness threshold is 400 μJy in all fields. For the clustering analysis, we selected all 24 μm
sources above this flux threshold. To eliminate Galactic stars (see Section 2.2.1), we cross-correlated this set of 24 μm sources
with the objects in the 2MASS survey using a matching radius of 2.5". Hereinafter, we refer to these source catalogs (with stars
eliminated) as the "2005-catalog" or "v.2005".

The second set of catalogs is based on the SWIRE Final Data Release (J. A. Surace et al., in preparation), a re-reduction of both
the IRAC and MIPS datasets reaching a fainter flux limit. Ancillary multi-wavelength photometry from the FUV to the NIR was
compiled for sources detected at either 3.6 μm or 4.5 μm into the so-called Data Fusion (M. Vaccari et al., in preparation). For the
IRAC images, the source detection was again done using SExtractor, while the MOPEX/APEX package (Makovoz & Marleau
2005) was used for MIPS data. The MOPEX/APEX package was specifically optimized for detection of point-like sources in
crowded fields, and its application results in a significant improvement in the completeness limit for MIPS data, which can be
as low as ~200 μJy (see below). The completeness of the IRAC detections was also improved compared to the previous data
release. The initial IRAC sources were associated with the data from other catalogs (e.g., the 2MASS PSC) using a matching radius
of 2.5". In order to avoid source confusion and false identification in the 24 μm band, Vaccari et al. matched 24 μm and IRAC
sources within the same radius of 2.5". For our analysis, we used all these 24 μm sources, and the selected sample is referred to
as the "2010-catalogs" or "v.2010".

Another significant difference between the 2005- and 2010-catalogs is in the methods of flux measurements for the MIPS
sources. The 2005 data release used aperture photometry with a set of apertures of 7.5"-15" radius, which contained 60%-85%
of the total flux, with suitable aperture corrections applied as determined by the MIPS instrument team. The MOPEX/APEX
package yields the total fluxes provided by the PSF fitting. This is significant in our case because the aperture and PSF-fitting
photometry have different problems in dealing with close source pairs, which can produce different results for the small-scale
clustering.

Because, as we show below, neither the 2010- nor the 2005-catalogs are completely free of problems, we produced our own list
of MIPS-detected sources (see Section 2.1 for details). This third data set is referred to as the "A1-catalog" below.

All 24 μm-IRAC catalogs were cross-correlated with the 2MASS survey (Skrutskie et al. 2006) in order to identify and remove
foreground stars using the Shupe et al. (2008) criterion and to build region masks (Section 2.2.1). It appeared that in general Galactic
stars comprise ~2% of the total number of sources detected in the 24 μm-IRAC bands of SWIRE images.



17 Available at http://swire.ipac.caltech.edu/swire

Fig. 8. Number of 24 μm sources per square degree per log-flux interval plotted vs. the logarithmic flux. Left: the sources were selected from the 2010-catalogs
(M. Vaccari et al., in preparation) in the four SWIRE fields: Lockman Hole (blue), ELAIS-N1 (green), ELAIS-N2 (red), and CDFS (magenta). Galactic stars
were masked out and eliminated. Right: the sources were selected from three catalogs in the Lockman Hole. Blue, red, and black lines are for the 2010-, 2005-, and
A1-catalogs, respectively.



TABLE 1
Properties of MIPS SWIRE Fields

Field          S_lim (μJy)      Area (deg^2)
Lockman Hole   S_2010 = 180     8.7
               S_2005 = 400
               S_A1 = 310
ELAIS-N1       160              7.1
ELAIS-N2       170              3.3
CDFS           180              6.2
ELAIS-S        S_A1 = 400       6.3

Note. The limiting fluxes, S_lim, reported here correspond to the maxima in the source
count histograms in Figure 8.



LIMITING FLUXES FOR INDIVIDUAL CATALOGS 

For a proper comparison of the angular correlation function between different versions of the source catalogs and different
fields, we have to make sure that the sources are selected above a flux which exceeds the completeness limit for each field/catalog.
Ideally, a completeness limit is a flux threshold above which (nearly) all real sources are detected and into which (almost) no
fainter sources migrate. The exact completeness limit for the MIPS/SWIRE data can be established only through Monte Carlo
simulations (e.g., Shupe et al. 2008). However, we can apply a useful empirical criterion and identify the sensitivity limit with the
point of maximum in the differential log N - log S distribution observed for each field/catalog.
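
The empirical criterion can be sketched as a search for the most populated logarithmic flux bin. The function name, the bin width, and the toy flux list below are assumptions of this illustration. With equal-width logarithmic bins, normalizing the counts per square degree and per log-flux interval does not move the position of the maximum.

```python
import math
from collections import Counter


def sensitivity_limit(fluxes, bin_width_dex=0.1):
    """Flux at the center of the most populated logarithmic flux bin."""
    counts = Counter(math.floor(math.log10(s) / bin_width_dex) for s in fluxes)
    peak_bin = max(counts, key=counts.get)
    return 10.0 ** ((peak_bin + 0.5) * bin_width_dex)


# Toy flux list in micro-Jy (illustrative values, not catalog data).
toy_fluxes = [100.0] * 2 + [300.0] * 5 + [1000.0] * 3
limit = sensitivity_limit(toy_fluxes, bin_width_dex=0.2)
```

For real catalogs the result is sensitive to the bin width and to noise in the counts near the peak, so the histogram should also be inspected directly.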

In Figure 8 (left), we show the number of sources per square degree and per logarithmic flux bin contained in the 2010-catalogs for
different SWIRE fields. The maxima in the differential log N - log S distribution in all cases are achieved near a flux of ~200 μJy.
However, there are clear differences in the number counts of faint sources up to a flux limit of S_24μm ~ 350 μJy. This probably
indicates a flux measurement uncertainty of ~100 μJy, which may also explain why the drop in the differential log N - log S
distribution below the point of maximum is not sharp but extends to ~100 μJy. Therefore, based on examination of the
log N - log S distributions, the correlation functions for the 2010-catalog in different SWIRE fields should be compared for sources
brighter than 350 μJy.

In Figure 8 (right), we show the source counts for the three different catalogs in the Lockman Hole field. There is a striking difference
in the sensitivity limits between the 2005 and 2010 versions of the SWIRE team catalogs: the maxima in the differential
log N - log S distributions are at S_24μm = 400 and 180 μJy, respectively. The sensitivity limit for the A1-catalog is between these two
values, at ≈310 μJy. Note that the drop in number counts below the maximum is very sharp for the A1-catalog, indicating a
high level of reliability for the flux measurements. Even though the log N - log S for the 2010-catalog extends further down, the
flux region S_24μm ≲ 350 μJy in this catalog might be affected by the scatter in the source flux measurements, as we have just
discussed.

The sensitivity limits (the points of maxima in the differential log N - log S distribution) for different fields and catalogs are
reported in Table 1 together with the field areas after applying the stellar mask (see discussion in Section 2.2.1). Below, we
compare the angular correlation function computed for different fields/catalogs taking into account these sensitivity limits.

Fig. 9. Left: angular correlation function for sources from the 2010-catalogs with fluxes brighter than S_24μm = 350 μJy detected in the Lockman Hole (red filled
triangles), ELAIS-N1 (blue open circles), and CDFS (black open triangles). Right: comparison of the angular correlation functions in the Lockman Hole field
using the sources from the 2010- and 2005-catalogs with S_24μm > 400 μJy.

COMPARISON OF THE ANGULAR CORRELATION FUNCTIONS 

We start with a comparison of the angular correlation functions, w(θ), computed for different SWIRE fields using the 2010- 
catalog. As discussed above, we use a flux threshold of 350 μJy. This is the flux above which the log N - log S distributions agree 
among different fields (Figure 8), and it is higher than the formal sensitivity limit for the 2010-catalogs. The results are shown 
in Figure 9 (left). Reassuringly, there is excellent agreement between the results in different fields. At the largest separations, 
~ 1° and above, the angular correlation function becomes consistent with zero, but one might expect distortions at such large 
scales because they are comparable to the size of the fields we are using. More relevant to our analysis are the obvious problems 
at small scales. There is a drop in the correlation signal at 0.003° < θ < 0.01°, and a strong positive signal located in a single bin 
at θ ~ 0.003°. As we discuss below, these distortions are probably related to blending of nearby sources due to the relatively large 
size of the MIPS PSF. 
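
For readers who want to reproduce this kind of comparison, the sketch below shows how w(θ) can be measured for sources above a flux threshold. This appendix does not restate which estimator is used, so the code assumes the Landy & Szalay (1993) estimator, a common choice. The function names, the flat-sky separation, the brute-force pair counting, and the toy catalog are all assumptions made for illustration; a real analysis would draw the random points inside the masked survey footprint.

```python
import math
import random

def pair_counts(a, b, edges, same=False):
    """Count pairs between point lists a and b in separation bins (degrees).

    Points are (ra, dec) tuples in degrees. Separations use a flat-sky
    approximation, adequate only for small fields. With same=True, a and b
    must be the same list and each unordered pair is counted once.
    """
    counts = [0] * (len(edges) - 1)
    for i, (ra1, dec1) in enumerate(a):
        start = i + 1 if same else 0
        for ra2, dec2 in b[start:]:
            dra = (ra1 - ra2) * math.cos(math.radians(0.5 * (dec1 + dec2)))
            sep = math.hypot(dra, dec1 - dec2)
            for k in range(len(counts)):
                if edges[k] <= sep < edges[k + 1]:
                    counts[k] += 1
                    break
    return counts

def landy_szalay(data, rand, edges):
    """w = (DD - 2 DR + RR) / RR with each term normalized by its pair total."""
    nd, nr = len(data), len(rand)
    dd = pair_counts(data, data, edges, same=True)
    rr = pair_counts(rand, rand, edges, same=True)
    dr = pair_counts(data, rand, edges)
    w = []
    for d, x, r in zip(dd, dr, rr):
        ddn = d / (nd * (nd - 1) / 2)
        drn = x / (nd * nr)
        rrn = r / (nr * (nr - 1) / 2)
        w.append((ddn - 2 * drn + rrn) / rrn if rrn > 0 else float("nan"))
    return w

# Toy usage (hypothetical catalog of (ra, dec, flux in μJy) in a 1 deg box)
random.seed(1)
catalog = [(random.uniform(0, 1), random.uniform(0, 1), random.uniform(100, 1000))
           for _ in range(300)]
data = [(ra, dec) for ra, dec, flux in catalog if flux > 350.0]  # flux threshold
rand = [(random.uniform(0, 1), random.uniform(0, 1)) for _ in range(400)]
w_theta = landy_szalay(data, rand, [0.01, 0.03, 0.1, 0.3])
```

Applying the same flux threshold to every field or catalog before counting pairs is the step that makes the measurements comparable.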

Next, we compare the correlation functions for the 2005- and 2010-catalogs above the sensitivity limit for v.2005 (400 μJy). 
The results for the Lockman Hole field are shown in Figure 9 (right). There is good agreement at large scales (θ > 0.02°) but 
a strong difference at small scales. While there is a drop in the correlation signal at 0.003° < θ < 0.01° for the 2010-catalog 
sources, there is a strong excess correlation in the same angular range for the v.2005 sources. The discrepancy probably 
does not arise because some real pairs at separations of ~ 30" are missing from the 2010-catalog: it is highly unlikely that this 
more sensitive source list would miss any sources brighter than 400 μJy. Rather, we suggest that some of these close pairs arise 
spuriously in the 2005-catalog because high fluxes are erroneously assigned to some faint sources in the vicinity of bright ones 
(see also Surace et al. 2005). 

Next, we compare the results for the Lockman Hole field using the sources from the 2010- and A1-catalogs above a flux 
threshold of S_24μm = 310 μJy, the sensitivity limit of the A1-catalog. The results are shown in Figure 10 (left). The measurements 
are nearly identical at scales θ > 0.01°, but the A1 correlation function shows somewhat weaker small-scale distortions. This 
impression is confirmed by cross-examination of the source detections from both catalogs overlaid on the input MIPS image 
(Figure 10 (right)). Most sources are found in both catalogs. A small number of real sources are contained in one catalog 
but not the other (examples are marked by blue arrows), but this is not surprising because the source fluxes are derived using 
different methods and so we can expect some "migration" across the flux threshold. However, there are some cases (marked by 
yellow arrows) where obviously spurious sources are identified in the 2010-catalog in the vicinity of bright or extended sources. 
We believe that these detections are responsible for the stronger small-scale distortions seen in the v.2010 correlation function. 
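
Sources that appear in one catalog but not the other can be listed with a simple positional cross-match. The sketch below is illustrative only: the function name, the 3" match radius, and the toy coordinates are assumptions, and the brute-force search is suitable only for small source lists.

```python
import math

def unmatched(cat_a, cat_b, radius_arcsec=3.0):
    """Return the sources of cat_a with no cat_b source within radius_arcsec.

    Catalogs are lists of (ra, dec) in degrees. The match radius is a
    hypothetical value; separations use a flat-sky approximation.
    """
    r_deg = radius_arcsec / 3600.0
    out = []
    for ra1, dec1 in cat_a:
        cosd = math.cos(math.radians(dec1))
        has_match = any(
            math.hypot((ra1 - ra2) * cosd, dec1 - dec2) <= r_deg
            for ra2, dec2 in cat_b
        )
        if not has_match:
            out.append((ra1, dec1))
    return out

# Toy positions (hypothetical): the first source has a counterpart, the second does not
cat_1 = [(161.0, 58.0), (161.01, 58.0)]
cat_2 = [(161.0002, 58.0001)]
only_in_1 = unmatched(cat_1, cat_2)
only_in_2 = unmatched(cat_2, cat_1)
```

Running the match in both directions separates sources missing from each catalog; deciding whether an unmatched source is real or spurious still requires inspecting the image, as done above.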

It is clear from the comparisons above that there is good agreement in the correlation functions at larger scales, θ > 0.01°, 
when we compare the data for different fields and catalogs above a common sensitivity threshold. The differences are localized 
to small scales and are generally traceable to problems related to blending of sources in the MIPS images because of the relatively 
poor angular resolution of this instrument. These problems are not surprising. The MIPS PSF has an FWHM of ≈ 6" and so 
the sources become resolvable only when they are separated by ~ 10" ≈ 0.003°. The MIPS PSF has wide wings: nearly 30% 
of the source flux is scattered outside the 8" (radius) aperture. Therefore, there should be substantial "cross-talk" in the flux 
measurements for sources separated by ~ 15" (and up to 30" depending on the source extraction algorithm). In any case, it appears 
that the angular correlation function measurements for the MIPS 24 μm sources are not reliable at θ < 0.01°, and it is best to 
restrict the analysis to larger scales. This is not a problem since our main goal is to measure the correlation length and the mass 
scale for the DM halos hosting the 24 μm sources, as these parameters are mainly constrained by the angular correlation observed 
near θ = 0.1° (Section 3). However, it would be interesting to put constraints on the location of star-forming galaxies within their 
DM halos, which is determined by the shape of the correlation function at small scales (e.g., Cooray & Sheth 2002; Kravtsov 
et al. 2004) and thus is not accessible to us. 

Fig. 10. Left: angular correlation function of 24 μm sources from the 2010 (red filled triangles) and A1 (blue filled circles) catalogs using a flux limit of 
S_24μm = 310 μJy in both cases. Right: comparison of the bright sources, S_24μm > 310 μJy, in the 2010- and A1-catalogs (green and red circles, respectively) 
in a subsection of the Lockman Hole field. Blue arrows point to real detections which are not present simultaneously in the two catalogs. Yellow arrows indicate 
spurious detections in the 2010-catalog arising in proximity of bright/extended sources. 

Fig. 11. Same as Figure 10, but the 2010-catalog sources are selected above a flux limit of S_24μm = 180 μJy. In the right panel, yellow arrows point to the faint 
sources in the 2010-catalog for which the flux measurements are significantly affected by the large-scale background variations. 
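
The angular scales quoted above for the MIPS PSF are given in arcseconds, while the correlation function is plotted in degrees. The short conversion below, with a hypothetical helper name, shows that all the blending scales fall below the θ = 0.01° cut adopted for the analysis.

```python
ARCSEC_PER_DEG = 3600.0

def arcsec_to_deg(arcsec):
    """Convert an angle from arcseconds to degrees."""
    return arcsec / ARCSEC_PER_DEG

# Scales quoted in the text (arcsec); the labels are descriptive only
scales = {
    "PSF FWHM": 6.0,
    "minimum resolvable separation": 10.0,
    "flux cross-talk": 15.0,
    "cross-talk, upper range": 30.0,
}
in_degrees = {name: arcsec_to_deg(value) for name, value in scales.items()}
for name, deg in in_degrees.items():
    print(f"{name}: {deg:.4f} deg")
```

The cut at 0.01° corresponds to 36", slightly above the largest cross-talk scale quoted in the text.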

Even though the A1-catalog appears to perform better for the smallest separations above its flux threshold, S_24μm = 310 μJy, 
the difference is rather small. The 2010-catalog, on the other hand, extends to significantly fainter fluxes, and so the question 
is, can we use these fainter sources to improve the statistics in the correlation function measurements? The comparison of the 
angular correlation function measurements in the Lockman Hole field for the A1- and 2010-catalogs above their respective flux 
limits of 310 and 180 μJy is shown in Figure 11 (left). Unfortunately, there are systematic deviations for the 2010 sources at 
angular scales 0.2° - 0.5° (recall that the results for the two catalogs were in excellent agreement for a common flux threshold of 
310 μJy, see Figure 10). The difference on these scales cannot be attributed to edge effects: the size of the MIPS field in the 
Lockman Hole region is ~ 4.6 × 1.9 deg. Rather, we believe that this difference can be traced to how the large-scale structures 
in the MIPS background affect the flux measurements for fainter sources in the 2010-catalog. Examination of the MIPS image 
shows that, indeed, for a significant number of sources (some marked by yellow arrows in Figure 11 (right)), the flux above 
180 μJy is assigned spuriously, and many such sources appear on top of larger-scale background structures. These are likely 
real sources because, by construction of the 2010-catalog, they have IRAC counterparts. It is also possible that these sources 
are suitable for measurements of the luminosity function or similar studies because an approximately equal number of objects 
"migrate" below 180 μJy in those regions with negative residual background. However, for clustering studies, these sources 
cannot be used because they arise on top of spatially correlated structures and thus can distort the angular correlation function at 
intermediate scales. 

Fig. 12. Angular correlation function of 24 μm sources from A1-catalogs in Lockman Hole (S_24μm > 310 μJy) (blue filled circles) and ELAIS-S1 (S_24μm > 
400 μJy) (black open triangles). 



As a final test, we compare the A1-based angular correlation functions for the Lockman Hole and ELAIS-S1 fields (Figure 12). 
The limiting flux for the A1-catalog in the ELAIS-S1 field is S_24μm = 400 μJy. At all angular scales, the correlation function 
computed for sources above this threshold in the ELAIS-S1 field is in excellent agreement with that for the Lockman Hole field 
and S_24μm > 310 μJy. 

In summary, using our own, completely independent source detection algorithm, we reproduced the log N - log S at S_24μm ≳ 
300 μJy and the angular correlation function results at scales θ > 0.01° obtained for the 2010-catalog. The main analysis presented in 
this paper will lead to nearly identical results using either the 2010- or our A1-catalogs of the 24 μm sources. The most significant 
differences in the measured w(θ) are localized to θ < 0.01°. They can be traced to different treatment of very crowded regions 
and zones in the immediate vicinity of bright sources, where our detection pipeline performs slightly better (Figure 10). On the 
basis of these considerations, we choose our A1-catalog in Lockman Hole to investigate clustering of 24 μm selected galaxies 
(Section^. 



